Corpus: guj_wikipedia_2021_100K


4.4.1.5 Number of Word-N-grams at Sentence Endings

Number of distinct word N-grams (N = 1...5) at sentence endings, counted over the first K sentences

K        # of words  # of bigrams  # of trigrams  # of 4-grams  # of 5-grams
100              38            88             98            99            99
1000            243           743            938           989           994
10000          1092          4732           8274          9676          9943
100000         5644         29900          67378         91935         98645
1000000        5644         29900          67378         91936         98646
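
As an illustration of how such counts could be obtained, the following is a minimal Python sketch. It assumes one sentence per string and plain whitespace tokenization; the corpus tooling may tokenize differently (for example in how it treats final punctuation). The function names `ending_ngram_counts` and `counts_for_prefixes` are hypothetical and not taken from the corpus software.

```python
from typing import Dict, Iterable, List, Set, Tuple


def ending_ngram_counts(sentences: Iterable[str], max_n: int = 5) -> List[int]:
    """Return the number of distinct sentence-final word N-grams for N = 1..max_n."""
    seen: List[Set[Tuple[str, ...]]] = [set() for _ in range(max_n)]
    for sentence in sentences:
        # Assumption: whitespace tokenization, punctuation kept as-is.
        tokens = sentence.split()
        for n in range(1, max_n + 1):
            # Sentences shorter than n words contribute no n-gram.
            if len(tokens) >= n:
                seen[n - 1].add(tuple(tokens[-n:]))
    return [len(s) for s in seen]


def counts_for_prefixes(
    sentences: List[str], ks: Iterable[int], max_n: int = 5
) -> Dict[int, List[int]]:
    """Compute the counts over the first K sentences for each K in ks."""
    # Slicing past the end of the list simply yields all sentences.
    return {k: ending_ngram_counts(sentences[:k], max_n) for k in ks}


if __name__ == "__main__":
    demo = ["a b c .", "x b c .", "a b d ."]
    for k, row in counts_for_prefixes(demo, [1, 2, 3, 10]).items():
        print(k, *row)
```

Because a slice beyond the end of the list returns every sentence, a K larger than the corpus size gives the same counts as the full corpus. This would be consistent with the last two rows of the table being nearly identical for a corpus of roughly 100K sentences.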


Zipf's diagram for sentence endings

[Figure: Gnuplot diagram]
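
The data behind a Zipf diagram is a list of (rank, frequency) pairs, usually plotted on log-log axes. The sketch below shows one way to derive such pairs, assuming that the diagram is based on the final word of each sentence and that sentences are whitespace-tokenized; both are assumptions, and `ending_rank_frequency` is a hypothetical name.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def ending_rank_frequency(sentences: Iterable[str]) -> List[Tuple[int, int]]:
    """Return (rank, frequency) pairs for sentence-final words, most frequent first."""
    endings: Counter = Counter()
    for sentence in sentences:
        # Assumption: whitespace tokenization; empty sentences are skipped.
        tokens = sentence.split()
        if tokens:
            endings[tokens[-1]] += 1
    freqs = sorted(endings.values(), reverse=True)
    return list(enumerate(freqs, start=1))


if __name__ == "__main__":
    demo = ["a b", "c b", "d e", "f b", "g e", "h i"]
    # Tab-separated output can be fed to a plotting tool with logarithmic axes.
    for rank, freq in ending_rank_frequency(demo):
        print(f"{rank}\t{freq}")
```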

8255 msec needed at 2021-07-10 23:10